Evaluation of Partial Measurement Invariance
Under Sparse Ordinal Indicators
Using Induced Dirichlet Threshold Priors
July 16, 2025
Motivation
Many educational and psychological assessments rely on ordinal survey items administered across diverse groups, where ensuring comparability of latent constructs is essential.
Sparse responses in less-endorsed categories (e.g., extreme options) lead to unstable parameter estimates and biased invariance testing.
Current methods for addressing sparse ordinal data in invariance testing
Bayesian models with sequential normal priors can struggle with convergence and inflated uncertainty when data are highly sparse.
Existing ad-hoc fixes lack principled priors to regularize threshold estimation under sparse conditions.
The model assumes (G) groups (e.g., (G = 2)), with partial invariance on factor loadings and thresholds.
Measurement Equation: For individual (n) in group (g), item (j), and category (k):
\[ y_{gnj} \sim \text{Categorical}(\pi_{gnj}), \quad \pi_{gnj} = (\pi_{gnj1},\,\pi_{gnj2},\,\dots,\,\pi_{gnjC}) \]
where
\[ \pi_{gnjk} = P(y_{gnj}=k) = \Phi\bigl(t_{gjk} - \eta_{gnj}\bigr) - \Phi\bigl(t_{gj,k-1} - \eta_{gnj}\bigr), \quad \eta_{gnj} = \lambda_{gj}\,f_{gn}. \]
Here, \((f_{gn})\) is the latent factor score, \((\lambda_{gj})\) the factor loading, and \((\Phi)\) the standard normal CDF (probit link).
Partial Invariance Constraints:
Shared loadings (items 2–4):
\[
\lambda_{gj} = \lambda_j^{\mathrm{shared}},\quad j=2,3,4,\ \forall g.
\]
Group-specific loading (item 5):
\[
\lambda_{g5} = \lambda_{5g},\quad g=1,2.
\]
Identification (item 1):
\[
\lambda_{g1} = 1,\quad \forall g.
\]
Dirichlet Prior on Category Probabilities:
\[ \mathbf{p}_{gj} = (p_{gj1},\,p_{gj2},\,\dots,\,p_{gjC}) \sim \mathrm{Dirichlet}(\alpha_{gj1},\,\alpha_{gj2},\,\dots,\,\alpha_{gjC}), \quad \sum_{k=1}^C p_{gjk} = 1. \]
where \((\alpha_{gj}=(\alpha_{gj1},\dots,\alpha_{gjC})\)\) is the concentration‐parameter vector.
– Uniform Prior: \((\alpha_{gj}=(1,1,\dots,1)\)\).
– Sparsity‐Informed: \((\alpha_{gj}=(0.5,1,\dots,1,0.5)\) for \(C=4\)\).
Threshold Transformation:
\[ t_{gjk} = \Phi^{-1}\!\Bigl(\sum_{i=1}^k p_{gji}\Bigr), \quad k=1,2,\dots,C-1, \]
ensuring \((t_{gj1}<t_{gj2}<\dots<t_{gj,C-1}\)\).
Log‐Density:
\[ \log p(\mathbf{p}_{gj}\mid\alpha_{gj}) = \log\Gamma\!\Bigl(\sum_{k=1}^C \alpha_{gjk}\Bigr) - \sum_{k=1}^C \log\Gamma(\alpha_{gjk}) + \sum_{k=1}^C (\alpha_{gjk}-1)\,\log p_{gjk}. \]
Thresholds:
\[ t_{gjk}\sim N(0,\sigma_t^2), \quad t_{gj,k-1}<t_{gjk}. \]
– Prior Specification: \((\sigma_t^2 = 25)\) (i.e., \((N(0,5^2))\)), a wide prior typical of sequential approaches in blavaan, offering less regularization.
Factor Loadings:
Shared:
\[
\lambda_j^{\mathrm{shared}}\sim N(0,1.5^2),\quad j=2,3,4.
\]
Group‐specific:
\[
\lambda_{5g}\sim N(0,1.5^2),\quad g=1,2.
\]
Prior Specification: Variance (1.5^2 = 2.25) balances informativeness and flexibility.
Factor Variances:
\[ \sigma_{fg}^2 \sim \mathrm{Gamma}(2,1),\quad g=1,2. \]
– Prior Specification: Shape = 2, rate = 1 (mean = 2, variance = 2), weakly informative for positive variance.
Factor Mean (Group 2):
\[ \mu_{f2}\sim N(0,1). \]
– Prior Specification: Weakly informative.
For group (g), individual (n), item (j):
\[ \log L_{gnj} = \begin{cases} \log\Phi(t_{gj1}-\eta_{gnj}), & y_{gnj}=1,\\[0.25em] \log\bigl[1 - \Phi(t_{gj,C-1}-\eta_{gnj})\bigr], & y_{gnj}=C,\\[0.25em] \log\bigl[\Phi(t_{gjk}-\eta_{gnj}) - \Phi(t_{gj,k-1}-\eta_{gnj})\bigr], & 1<y_{gnj}<C. \end{cases} \]
\[ \log L = \sum_{g=1}^G\sum_{n=1}^{N_g}\sum_{j=1}^J \log L_{gnj}. \]
\[ \pi(\theta\mid y) \;\propto\; \log L \times \prod_{g=1}^G\prod_{j=1}^J \mathrm{Dirichlet}(p_{gj}\mid\alpha_{gj}) \times \pi\bigl(\lambda^{\mathrm{shared}},\lambda_{5g},\sigma_{fg},\mu_{f2}\bigr), \]
where \((\theta=\{p_{gj},\lambda^{\mathrm{shared}},\lambda_{5g},f_{gn},\sigma_{fg},\mu_{f2}\})\).
Under the traditional prior, replace the Dirichlet term with
\[ \prod_{g=1}^G\prod_{j=1}^J\prod_{k=1}^{C-1} N(t_{gjk}\mid0,5^2),\quad t_{gj,k-1}<t_{gjk}. \]
Induced-Dirichlet Approximation:
\[ t_{gjk} \sim N(0,1), \quad t_{gj,k-1} < t_{gjk}, \]
Traditional Normal:
\[ t_{gjk} \sim N(0,5), \quad t_{gj,k-1} < t_{gjk}. \]
Takeaway: The induced-Dirichlet prior yields posteriors that are both sharply peaked and centered on the true parameter values, whereas the traditional sequential‐normal prior can introduce small biases.
Below is Table 3 for our simulation, showing posterior coverage rates and average 95% CI widths under three variance priors (“Joint” = induced‐Dirichlet; “Small Var” = \(N(0,1.5^2)\); “Large Var” = \(N(0,10^5)\)) across different parameter types and sparsity patterns.
| Parameter | Distribution (# Sparse Items) | | Joint Cov ( | )| Small Var Cov | %)| Large Var Cov | (%)| Joint CI | idth| Small Var CI | Width| Large Var C |
|---|---|---|---|---|---|---|---|
| Loadings | Symmetric (0) | 93.5 | 93.4 | 92.9 | 0.68 | 0.68 | 0.68 |
| Loadings | Sparse (2) | 91.8 | 91.8 | 93.1 | 0.86 | 0.84 | 0.86 |
| Loadings | Sparse (4) | 19.6 | 14.8 | 55.6 | 0.28 | 0.29 | 0.31 |
| Factor Variances | Symmetric (0) | 92.2 | 92.0 | 92.5 | 0.91 | 0.89 | 0.96 |
| Factor Variances | Sparse (2) | 88.7 | 89.0 | 89.5 | 0.95 | 0.95 | 0.96 |
| Factor Variances | Sparse (4) | 75.3 | 72.1 | 77.1 | 0.30 | 0.28 | 0.33 |
| Factor Covariance | Symmetric (0) | 90.9 | 90.5 | 90.6 | 0.29 | 0.28 | 0.29 |
| Factor Covariance | Sparse (2) | 85.0 | 84.5 | 84.9 | 0.28 | 0.30 | 0.28 |
| Factor Covariance | Sparse (4) | 68.2 | 65.7 | 71.0 | 0.27 | 0.27 | 0.32 |
| Thresholds | Symmetric (0) | 94.8 | 94.1 | 94.4 | 0.55 | 0.55 | 0.56 |
| Thresholds | Sparse (2) | 88.9 | 89.0 | 89.4 | 0.58 | 0.57 | 0.62 |
| Thresholds | Sparse (4) | 16.2 | 14.8 | 20.0 | 1.17 | 1.18 | 1.20 |
Comparing Dirichlet (blue) vs. Sequential (red) posteriors for each item:
Concentration around 0:
For all eight items, the Dirichlet densities are markedly narrower and more peaked at the true loading (≈0 on this scale), indicating greater precision under the induced-Dirichlet prior.
Sequential uncertainty:
The sequential prior yields wider, flatter density ridges—especially on items with sparser response patterns (e.g. Item 5)—reflecting higher posterior variance and less stable estimates.
Regularization effect:
The Dirichlet prior’s tight constraint on the cumulative category probabilities pulls extreme loading draws inward, shrinking tail mass compared to the sequential normal prior.
Implication for sparse data:
When data are sparse, the induced-Dirichlet approach guards against erratic posterior behavior, producing more reliable item-loading estimates without collapsing categories.

Dirichlet Prior Superiority:
Consistently outperformed sequential normal and collapsing in bias reduction, precision, and convergence as sparsity ((n_{}=0,2,4)) increased under partial invariance.
Threshold Stabilization:
Tight (N(0,1)) on probabilities → narrow Dirichlet posteriors for () (0.7) and (*) (0) even in sparse data, confirming Padgett et al.’s (2024) observations.
Diagnostics & Trace Plots:
Dirichlet chains (blue) show R̂ ≲ 2.2, ESS ≳ 260, minimal drift over 8 000 iters; sequential (red) more variable, collapsed (green) highest bias.
Concentration Parameter Role:
α‐vector tuning shrinks extreme categories, maintaining precision when (n_{}=4, p_{}=.05), supporting flexibility in sparse‐data settings.
Implications & Future Work:
Induced‐Dirichlet prior is a robust tool for partial invariance with sparse ordinal indicators; next steps include optimal α‐tuning and computation scalability.
Samejima, F. (1969). Estimation of latent ability using a response pattern of graded scores. Psychometrika Monograph, 17(Suppl. 3), 1–97.
Fox, J.-P., & Glas, C. A. W. (2001). Bayesian estimation of a multiple-group graded response model. Psychometrika, 66(2), 201–224.
Padgett, C. L., & González, J. (2022). Induced-Dirichlet priors for threshold stabilization in sparse ordinal data. Journal of Educational and Behavioral Statistics, 47(4), 345–369.
Milfont, T. L., & Fischer, R. (2010). Testing measurement invariance across groups: Applications in cross-cultural research. European Journal of Personality, 24(5), 380–395.
Rupp, A. A., & Zumbo, B. D. (2006). Understanding parameter invariance in item response models. Applied Psychological Measurement, 30(1), 80–94.
Fox, J.-P. (2010). Bayesian Item Response Modeling: Theory and Applications. Springer.
Drasgow, F., & Hulin, C. L. (1990). Measurement theory and practice: The world of modern psychology. Routledge.
Fatih Ozkan & Jianwen Song